Exploiting Multiple Sources for Open-Domain Hypernym Discovery
نویسندگان
چکیده
Hypernym discovery aims to extract such noun pairs that one noun is a hypernym of the other. Most previous methods are based on lexical patterns but perform badly on opendomain data. Other work extracts hypernym relations from encyclopedias but has limited coverage. This paper proposes a simple yet effective distant supervision framework for Chinese open-domain hypernym discovery. Given an entity name, we try to discover its hypernyms by leveraging knowledge from multiple sources, i.e., search engine results, encyclopedias, and morphology of the entity name. First, we extract candidate hypernyms from the above sources. Then, we apply a statistical ranking model to select correct hypernyms. A set of novel features is proposed for the ranking model. We also present a heuristic strategy to build a large-scale noisy training data for the model without human annotation. Experimental results demonstrate that our approach outperforms the state-of-the-art methods on a manually labeled test dataset.
منابع مشابه
Supervised Distributional Hypernym Discovery via Domain Adaptation
Lexical taxonomies are graph-like hierarchical structures that provide a formal representation of knowledge. Most knowledge graphs to date rely on is-a (hypernymic) relations as the backbone of their semantic structure. In this paper, we propose a supervised distributional framework for hypernym discovery which operates at the sense level, enabling large-scale automatic acquisition of disambigu...
متن کاملBabelDomains: Large-Scale Domain Labeling of Lexical Resources
In this paper we present BabelDomains, a unified resource which provides lexical items with information about domains of knowledge. We propose an automatic method that uses knowledge from various lexical resources, exploiting both distributional and graph-based clues, to accurately propagate domain information. We evaluate our methodology intrinsically on two lexical resources (WordNet and Babe...
متن کاملFinding and Expanding Hypernymic Relations in the Music Domain
Lexical taxonomies are tree or directed acyclic graph-like structures where each node represents a concept and each edge encodes a binary hypernymic (is-a) relation. These lexical resources are useful for AI tasks like Information Retrieval or Machine Translation. Two main trends exist in the construction and exploitation of these resources: On one hand, general purpose taxonomies like WordNet,...
متن کاملUnsupervised Hypernym Detection by Distributional Inclusion Vector Embedding
Modeling hypernymy, such as poodle is-a dog, is an important generalization aid to many NLP tasks, such as entailment, relation extraction, and question answering. Supervised learning from labeled hypernym sources, such as WordNet, limit the coverage of these models, which can be addressed by learning hypernyms from unlabeled text. Existing unsupervised methods either do not scale to large voca...
متن کاملExTaSem! Extending, Taxonomizing and Semantifying Domain Terminologies
We introduce EXTASEM!, a novel approach for the automatic learning of lexical taxonomies from domain terminologies. First, we exploit a very large semantic network to collect thousands of in-domain textual definitions. Second, we extract (hyponym, hypernym) pairs from each definition with a CRF-based algorithm trained on manually-validated data. Finally, we introduce a graph induction procedure...
متن کامل